AITopics

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Neural Information Processing Systems Feb-8-2026, 06:15:26 GMT

448d5eda79895153938a8431919f4c9f-Supplemental.pdf

benchmark, environment interaction, obstacle, (15 more...)

Industry: Transportation (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Neural Information Processing Systems Feb-7-2026, 16:06:46 GMT

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

Behavioral cloning has proven to be effective for learning sequential decision-making policies from expert demonstrations.

expert demonstration, machine learning, reinforcement learning, (15 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Neural Information Processing Systems Dec-23-2025, 19:48:14 GMT

Visual Adversarial Imitation Learning using Variational Models

Reward function specification, which requires considerable human effort and iteration, remains a major impediment for learning behaviors through deep reinforcement learning. In contrast, providing visual demonstrations of desired behaviors presents an easier and more natural way to teach agents. We consider a setting where an agent is provided a fixed dataset of visual demonstrations illustrating how to perform a task, and must learn to solve the task using the provided demonstrations and unsupervised environment interactions. This setting presents a number of challenges including representation learning for visual observations, sample complexity due to high dimensional spaces, and learning instability due to the lack of a fixed reward or learning signal. Towards addressing these challenges, we develop a variational model-based adversarial imitation learning (V-MAIL) algorithm. The model-based approach provides a strong signal for representation learning, enables sample efficiency, and improves the stability of adversarial training by enabling on-policy learning. Through experiments involving several vision-based locomotion and manipulation tasks, we find that V-MAIL learns successful visuomotor policies in a sample-efficient manner, has better stability compared to prior work, and also achieves higher asymptotic performance. We further find that by transferring the learned models, V-MAIL can learn new tasks from visual demonstrations without any additional environment interactions. All results including videos can be found online at https://sites.google.com/view/variational-mail

name change, visual adversarial imitation learning, visual demonstration, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Sen, Sayambhu, Bhatnagar, Shalabh

Enabling Off-Policy Imitation Learning with Deep Actor Critic Stabilization

arXiv.org Artificial Intelligence Nov-11-2025

Learning complex policies with Reinforcement Learning (RL) is often hindered by instability and slow convergence, a problem exacerbated by the difficulty of reward engineering. Imitation Learning (IL) from expert demonstrations bypasses this reliance on rewards. However, state-of-the-art IL methods, exemplified by Generative Adversarial Imitation Learning (GAIL) (Ho et al.), suffer from severe sample inefficiency. This is a direct consequence of their foundational on-policy algorithms, such as TRPO (Schulman et al.). In this work, we introduce an adversarial imitation learning algorithm that incorporates off-policy learning to improve sample efficiency. By combining an off-policy framework with auxiliary techniques (specifically, double Q-network-based stabilization and value learning without reward function inference), we demonstrate a reduction in the samples required to robustly match expert behavior.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2511.07288

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Artificial Intelligence Oct-21-2025

MoRe-ERL: Learning Motion Residuals using Episodic Reinforcement Learning

Huang, Xi, Zhou, Hongyi, Li, Ge, Tang, Yucheng, Liao, Weiran, Hein, Björn, Asfour, Tamim, Lioutikov, Rudolf

We propose MoRe-ERL, a framework that combines Episodic Reinforcement Learning (ERL) and residual learning, which refines preplanned reference trajectories into safe, feasible, and efficient task-specific trajectories. This framework is general enough to incorporate into arbitrary ERL methods and motion generators seamlessly. MoRe-ERL identifies trajectory segments requiring modification while preserving critical task-related maneuvers. Then it generates smooth residual adjustments using B-Spline-based movement primitives to ensure adaptability to dynamic task contexts and smoothness in trajectory refinement. Experimental results demonstrate that residual learning significantly outperforms training from scratch using ERL methods, achieving superior sample efficiency and task performance. Hardware evaluations further validate the framework, showing that policies trained in simulation can be directly deployed in real-world systems, exhibiting a minimal sim-to-real gap. Robotic applications, such as multi-arm cooperation, often require frequent motion adaptation to ensure safety and task efficiency.

machine learning, reinforcement learning, trajectory, (15 more...)

2508.01409

Country: Europe > Germany (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing Systems Oct-9-2025, 14:09:55 GMT

A Safely Imitating a Neural Policy

Here we provide proofs of the theoretical results from Section 3.2 and extend the discussion of a few

artificial intelligence, benchmark, machine learning, (17 more...)

Industry: Transportation (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

arXiv.org Artificial Intelligence Oct-2-2025

Learning Hierarchical Domain Models Through Environment-Grounded Interaction

Kienle, Claudius, Alt, Benjamin, Arenz, Oleg, Peters, Jan

Domain models enable autonomous agents to solve long-horizon tasks by producing interpretable plans. However, in open-world environments, a single general domain model cannot capture the variety of tasks, so agents must generate suitable task-specific models on the fly. Large Language Models (LLMs), with their implicit common knowledge, can generate such domains, but suffer from high error rates that limit their applicability. Hence, related work relies on extensive human feedback or prior knowledge, which undermines autonomous, open-world deployment. In this work, we propose LODGE, a framework for autonomous domain learning from LLMs and environment grounding. LODGE builds on hierarchical abstractions and automated simulations to identify and correct inconsistencies between abstraction layers and between the model and environment. Our framework is task-agnostic, as it generates predicates, operators, and their preconditions and effects, while only assuming access to a simulator and a set of generic, executable low-level skills. Experiments on two International Planning Competition (IPC) domains and a robotic assembly domain show that LODGE yields more accurate domain models and higher task success than existing methods, requiring remarkably few environment interactions and no human feedback or demonstrations.

artificial intelligence, large language model, natural language, (19 more...)

2505.13497

Country: Europe > Germany > Bremen > Bremen (0.28)

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

arXiv.org Artificial Intelligence Aug-27-2025

Playstyle and Artificial Intelligence: An Initial Blueprint Through the Lens of Video Games

Lin, Chiu-Chou

Contemporary artificial intelligence (AI) development largely centers on rational decision-making, valued for its measurability and suitability for objective evaluation. Yet in real-world contexts, an intelligent agent's decisions are shaped not only by logic but also by deeper influences such as beliefs, values, and preferences. The diversity of human decision-making styles emerges from these differences, highlighting that "style" is an essential but often overlooked dimension of intelligence. This dissertation introduces playstyle as an alternative lens for observing and analyzing the decision-making behavior of intelligent agents, and examines its foundational meaning and historical context from a philosophical perspective. By analyzing how beliefs and values drive intentions and actions, we construct a two-tier framework for style formation: the external interaction loop with the environment and the internal cognitive loop of deliberation. On this basis, we formalize style-related characteristics and propose measurable indicators such as style capacity, style popularity, and evolutionary dynamics. The study focuses on three core research directions: (1) Defining and measuring playstyle, proposing a general playstyle metric based on discretized state spaces, and extending it to quantify strategic diversity and competitive balance; (2) Expressing and generating playstyle, exploring how reinforcement learning and imitation learning can be used to train agents exhibiting specific stylistic tendencies, and introducing a novel approach for human-like style learning and modeling; and (3) Practical applications, analyzing the potential of these techniques in domains such as game design and interactive entertainment. Finally, the dissertation outlines future extensions, including the role of style as a core element in building artificial general intelligence (AGI).
By investigating stylistic variation, we aim to rethink autonomy, value expression, and even offer a tangible perspective on the ultimate philosophical question: What is the soul?

artificial intelligence, machine learning, playstyle intersection similarity, (20 more...)